A Novel Architecture for Data Mining Grid Scheduler
Authors
Abstract
Task parallelization is an effective way to improve the performance of Data Mining applications. The Grid scheduler plays an important role in managing subtasks so as to achieve high performance. We introduce an additional component, which we call the serializer, whose purpose is to decompose a task into a series of independent subtasks according to the directed acyclic graph (DAG) and to send them to the scheduler queue as soon as they become executable with respect to the DAG dependencies. The experimental results demonstrate that the architecture has good performance.
Key-Words: Scheduling Architecture, Knowledge Grid, Data Mining
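The serializer idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class name `Serializer`, the method `task_finished`, the helper `run_all`, and the sample workflow are all hypothetical, and the sketch assumes every task appears as a key in the dependency map. It only shows the stated behaviour, namely that a subtask enters the scheduler queue as soon as all of its DAG predecessors have completed.

```python
from collections import deque


class Serializer:
    """Hypothetical sketch: releases DAG tasks to a scheduler queue
    once all of their dependencies have finished."""

    def __init__(self, dependencies):
        # dependencies maps each task name to the tasks it depends on.
        # Assumption: every task, including those with no
        # prerequisites, appears as a key.
        self.remaining = {t: set(deps) for t, deps in dependencies.items()}
        self.dependents = {t: [] for t in dependencies}
        for task, deps in self.remaining.items():
            for dep in deps:
                self.dependents[dep].append(task)
        self.scheduler_queue = deque()
        # Tasks without prerequisites are executable immediately.
        for task, deps in self.remaining.items():
            if not deps:
                self.scheduler_queue.append(task)

    def task_finished(self, task):
        """Record a completed task and enqueue any tasks it unblocks."""
        for child in self.dependents[task]:
            pending = self.remaining[child]
            if task in pending:
                pending.remove(task)
                if not pending:
                    self.scheduler_queue.append(child)


def run_all(dependencies):
    """Drain the queue sequentially; a real Grid scheduler would
    dispatch the queued tasks to nodes in parallel."""
    serializer = Serializer(dependencies)
    order = []
    while serializer.scheduler_queue:
        task = serializer.scheduler_queue.popleft()
        order.append(task)  # stand-in for executing the task
        serializer.task_finished(task)
    return order


# Hypothetical data mining workflow used only for illustration.
workflow = {
    "load": [],
    "clean": ["load"],
    "cluster": ["clean"],
    "classify": ["clean"],
    "report": ["cluster", "classify"],
}
order = run_all(workflow)
print(order)
```

In this sketch `cluster` and `classify` become executable at the same moment, once `clean` completes, which is the point at which a scheduler could run them in parallel on different Grid nodes.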
Similar resources
Workflow-based Tasks Scheduling on Grid
The distributed nature of data and the need for high performance make the Grid a suitable environment for distributed data mining (DDM). Since distributed data mining applications are typically data intensive, one of the main requirements of such a DDM Grid environment is efficient workflow scheduling. We propose an architecture for a Knowledge Grid scheduler that results in the minimal r...
Promoting performance and separation of concerns for data mining applications on the grid
Grid Computing brought the promise of making high-performance computing cheaper and more easily available than traditional supercomputing platforms. Such a promise was very well received by the data mining (DM) community, as DM applications typically process very large datasets and are thus very resource intensive. However, since the Grid is very dynamic and parallel data mining is prone to loa...
GridMiner: An Infrastructure for Data Mining on Computational Grids
Knowledge discovery in datasets integrated into Grids is a challenging research task. These large datasets are being collected and accumulated across a wide variety of fields at a dramatic pace. They are often heterogeneous, geographically distributed, and used globally by large user communities. There are major challenges involved in the efficient and reliable storage, fast processing, in...
SoPhIA: A Unified Architecture for Knowledge Discovery
This paper presents a novel architecture, Soph.I.A (Sophisticated Intelligent Architecture), which integrates Knowledge Management and Data Mining into a unified Knowledge Discovery Process. Within SophIA, Data Mining is driven by knowledge captured from domain experts. The Knowledge Grid is briefly reviewed to envision the implementation of the proposed framework.
Design and implementation of a data mining grid-aware architecture
Current business processes often use data from several sources. Such data are typically heterogeneous and incomplete, and usually involve a huge number of records. This implies that the data must be transformed into a set of patterns, rules or some other kind of formalism that helps to understand the underlying information. The participation of several organizations in this process makes the assimilatio...